Creating and manipulating monte carlo spectra using MITK-MSI

In this tutorial we will learn how to

  1. create reflectance spectra from examplary tissues
  2. how to analyse and visualize the created spectra
  3. how to manipulate them

The MITK-MSI software provides a wrapper to the popular MCML approach to simulate how light travels through tissue. This wrapper can be found in mc/sim.py. In this tutorial we will utilize our tissue model which uses this wrapper to create the reflectance spectra.

As a prerequisit, you need a MCML monte carlo simulation which uses the format specified here. I tested this software with the GPU accelerated version which can be found here.


In [1]:
# 1.1 create spectra - setup simulation environment

# some necessary imports
import logging
import numpy as np
import os
# everything related to the simulation wrapper
from mc import sim
# the factories create batches (tissue samples) and suited tissue models
from mc import factories
# function which runs simulations for each wavelength
from mc.create_spectrum import create_spectrum

# Where does your monte carlo simulation executable resides in?
MCML_EXECUTABLE = "/home/wirkert/workspace/monteCarlo/gpumcml/fast-gpumcml/gpumcml.sm_20"
# The MCML needs a simulation input file, where shall it be created?
MCI_FILENAME = "./temp.mci"
# filename of the file with the simulation results. Due to a bug in GPUMCML will always
# be created in the same folder as the MCML executable
MCO_FILENAME = "temp.mco"
# The wavelengths for which we want to run our simulation
WAVELENGTHS = np.arange(450, 720, 2) * 10 ** -9

# we want to create standard colonic tissue as specified in the IPCAI 2016 publication
# "Robust Near Real-Time Estimation of Physiological Parameters from Megapixel
# Multispectral Images with Inverse Monte Carlo and Random Forest Regression"
factory = factories.ColonMuscleMeanScatteringFactory()
# if you want to create data from the generic tissue mentioned in the paper, choose:
#factory = factories.GenericMeanScatteringFactory()

# create a simulation wrapper
# the simulation wrapper wraps the mcml executable in python code
sim_wrapper = sim.SimWrapper()
# our simulation needs to know where the input file for the simulation
# shall resign (will be automatically created)
sim_wrapper.set_mci_filename(MCI_FILENAME)
# also it needs to know where the simulation executable shall lie in
sim_wrapper.set_mcml_executable(MCML_EXECUTABLE)

# create the tissue model
# it is responsible for writing the simulation input file
tissue_model = factory.create_tissue_model()
# tell it where the input file shall lie in
tissue_model.set_mci_filename(sim_wrapper.mci_filename)
# also set the output filename
tissue_model.set_mco_filename(MCO_FILENAME)
# tell it how much photons shall be simulated. Will be set to 10**6 by standard,
# this is just an example
tissue_model.set_nr_photons(10**6)

In [2]:
# 1.2 create spectra - create tissue samples for simulation

# setup batch with tissue instances which should be simulated
batch = factory.create_batch_to_simulate()
# we want to simulate ten tissue instances in this example
nr_samples = 10
df = batch.create_parameters(10)

# lets have a look at the dataframe. Each row corresponds to one tissue instance,
# each tissue instance is defined by various layers, which all have certain parameters
# like e.g. oxygenation (here sao2)
df


Out[2]:
layer0 layer1 layer2
vhb sao2 a_mie b_mie d n g vhb sao2 a_mie ... d n g vhb sao2 a_mie b_mie d n g
0 0.003780 0.447583 2425.900882 1.286 0.000907 1.36 0.872626 0.068985 0.447583 912.591151 ... 0.000503 1.36 0.948700 0.034427 0.447583 3150.494032 1.286 0.000454 1.38 0.804610
1 0.021198 0.184410 3022.116532 1.286 0.000782 1.36 0.930268 0.063804 0.184410 10.000000 ... 0.000735 1.36 0.830601 0.028247 0.184410 3101.261499 1.286 0.000430 1.38 0.813056
2 0.052096 0.307445 1171.322758 1.286 0.000897 1.36 0.924933 0.021053 0.307445 1433.788625 ... 0.000738 1.36 0.807348 0.006614 0.307445 1002.536901 1.286 0.000485 1.38 0.943837
3 0.076799 0.471304 1599.038671 1.286 0.000878 1.36 0.900637 0.084504 0.471304 3228.771326 ... 0.000605 1.36 0.905222 0.043674 0.471304 1810.114303 1.286 0.000504 1.38 0.868953
4 0.053123 0.108137 3524.480885 1.286 0.001001 1.36 0.833434 0.095382 0.108137 1988.134950 ... 0.000643 1.36 0.920940 0.031201 0.108137 1731.751283 1.286 0.000542 1.38 0.906191
5 0.009147 0.866703 372.754799 1.286 0.000943 1.36 0.841274 0.094915 0.866703 1660.695761 ... 0.000505 1.36 0.856969 0.009274 0.866703 1991.390563 1.286 0.000439 1.38 0.813544
6 0.082903 0.206032 2314.415411 1.286 0.000663 1.36 0.884492 0.038608 0.206032 459.458466 ... 0.000573 1.36 0.850574 0.063746 0.206032 2728.820678 1.286 0.000494 1.38 0.801766
7 0.040719 0.200956 1858.910121 1.286 0.000772 1.36 0.884516 0.003930 0.200956 3564.807750 ... 0.000766 1.36 0.903764 0.047257 0.200956 1429.574407 1.286 0.000468 1.38 0.849055
8 0.028810 0.729636 1751.818248 1.286 0.000886 1.36 0.813391 0.007477 0.729636 1388.286116 ... 0.000813 1.36 0.914125 0.092993 0.729636 4001.446613 1.286 0.000504 1.38 0.844552
9 0.089758 0.665397 3768.709188 1.286 0.000800 1.36 0.836711 0.015648 0.665397 1298.629577 ... 0.000620 1.36 0.927904 0.048023 0.665397 3015.289386 1.286 0.000427 1.38 0.893833

10 rows × 21 columns


In [3]:
# 1.3 create spectra - run simulation

# add reflectance column to dataframe
for w in WAVELENGTHS:
    df["reflectances", w] = np.NAN # the reflectances have not been calculated yet, thus set no nan

# for each instance in our batch
for i in range(df.shape[0]):
    # set the desired element in the dataframe to be simulated
    tissue_model.set_dataframe_row(df.loc[i, :])
    logging.info("running simulation " + str(i) + " for\n" +
                 str(tissue_model))
    reflectances = create_spectrum(tissue_model, sim_wrapper, WAVELENGTHS)
    # store in dataframe
    for r, w in zip(reflectances, WAVELENGTHS):
        df["reflectances", w][i] = r
        
# clean up temporarily created files
os.remove(MCI_FILENAME)
created_mco_file = os.path.join(os.path.split(MCML_EXECUTABLE)[0], MCO_FILENAME)
if os.path.isfile(created_mco_file):
    os.remove(created_mco_file)

# Hooray, finished,
# now our dataframe also contains reflectances for each tissue instance:
df["reflectances"]


Out[3]:
4.5e-07 4.52e-07 4.54e-07 4.56e-07 4.58e-07 4.6e-07 4.62e-07 4.64e-07 4.66e-07 4.68e-07 ... 7e-07 7.02e-07 7.04e-07 7.06e-07 7.08e-07 7.1e-07 7.12e-07 7.14e-07 7.16e-07 7.18e-07
0 0.318616 0.346943 0.371254 0.379878 0.385557 0.389087 0.394877 0.397762 0.401003 0.404130 ... 0.494731 0.495162 0.495068 0.495426 0.495447 0.495008 0.495081 0.495093 0.494623 0.494477
1 0.140266 0.189640 0.243554 0.259792 0.275593 0.284833 0.296358 0.302311 0.308877 0.314986 ... 0.446426 0.447138 0.448675 0.449469 0.450297 0.451485 0.451886 0.451767 0.452054 0.452514
2 0.021747 0.035154 0.056293 0.065113 0.073610 0.080725 0.088940 0.094565 0.100813 0.107133 ... 0.378155 0.379112 0.379953 0.381048 0.382271 0.382699 0.384109 0.384781 0.385816 0.386681
3 0.021931 0.031995 0.045955 0.051771 0.056990 0.061694 0.067690 0.071328 0.076024 0.080335 ... 0.422478 0.424793 0.425965 0.428011 0.429414 0.430371 0.432106 0.433527 0.433773 0.435063
4 0.069287 0.107174 0.159794 0.177421 0.196123 0.208005 0.221937 0.231352 0.239231 0.247642 ... 0.477140 0.478393 0.481133 0.483172 0.485144 0.486761 0.488254 0.490285 0.491700 0.492445
5 0.045606 0.051312 0.058581 0.062728 0.065359 0.068912 0.073484 0.075951 0.080196 0.084404 ... 0.382621 0.381922 0.381471 0.381451 0.380481 0.380568 0.379841 0.379317 0.378714 0.378075
6 0.027393 0.044668 0.070860 0.081027 0.091248 0.098535 0.107426 0.112543 0.118445 0.123929 ... 0.371301 0.373190 0.375097 0.377324 0.379127 0.380524 0.382481 0.383658 0.385444 0.386918
7 0.048908 0.078857 0.124928 0.142237 0.159621 0.172280 0.186836 0.196870 0.206029 0.216111 ... 0.494583 0.496394 0.497351 0.498217 0.499010 0.499452 0.501481 0.501600 0.502315 0.503171
8 0.086791 0.105141 0.125877 0.135923 0.143073 0.151115 0.160239 0.165631 0.173878 0.180887 ... 0.516076 0.516351 0.517303 0.517044 0.516727 0.516664 0.515435 0.515228 0.515323 0.514933
9 0.056338 0.071479 0.089041 0.097041 0.103308 0.109447 0.116874 0.120901 0.127804 0.133711 ... 0.506670 0.506889 0.507858 0.508143 0.508351 0.508218 0.509040 0.509576 0.508506 0.508549

10 rows × 135 columns


In [4]:
# 2.1 analyse spectra - plot reflectances

# the usual settings for plotting in ipython notebooks
import matplotlib.pylab as plt
%matplotlib inline

# let's have a look at our reflectances
df["reflectances"].T.plot(kind="line")
plt.ylabel("reflectance")
plt.xlabel("wavelengths [m]")
# put legend outside of plot
plt.gca().legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.grid()



In [5]:
# 2.1 analyse spectra - show distribution of blood volume fraction (vhb) and sao2

# now we need some special pandas functions
import pandas as pd

# we're interested in the distribution of vhb and sao2 in the first layer (layer0)
df_vhb_sao2 = df["layer0"][["vhb", "sao2"]]
# plot a scatter matrix showing the distribution of vhb and sao2.
# of course, with this little data this does not really make sense,
# however it is a useful tool for analysis if much data is available
pd.tools.plotting.scatter_matrix(df_vhb_sao2, alpha=0.75, figsize=(6, 6))
plt.show()



In [6]:
# 3.1 manipulate spectra - apply sliding average
# in 3.1 and 3.2 we will adapt the generated spectra to an imaginary imaging system
# This system has filters with 20nm bandwith (taken care of in 3.1)
# and takes multispectral images in 10nm steps (taken care of in 3.2)

# the module mc.dfmanipulations was written to provide some basic,
# often needed manipulations of the calculated spectra
# all dmfmanipulations are performed inplace, however, the df is also returned.
import mc.dfmanipulations as dfmani

# first copy to not lose our original data
df2 = df.copy()
# We apply a sliding average to our data. This is usefull if 
# we want to see e.g. how the reflectance was recorded by bands with a certain width
# a sliding average of 11 will take the five left and five right of the current reflectance
# and average. Because we take 2nm steps of reflectance in our simulation this means
# a 20nm window.
dfmani.fold_by_sliding_average(df2, 11)

# lets again plot the reflectances
df2["reflectances"].T.plot(kind="line")
plt.ylabel("reflectance")
plt.xlabel("wavelengths [m]")
# put legend outside of plot
plt.gca().legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.grid()
# we can see that the bump at 560nm is "smoother"



In [7]:
# 3.2 manipulate spectra - select certain wavelenghts

# our imaginary imaging system takes images in 10nm steps from 470 to 660nm
imaging_system_wavelengths = np.arange(470, 670, 10) * 10**-9

df3 = df2.copy()
dfmani.interpolate_wavelengths(df3, imaging_system_wavelengths)

# let's look at the newly created reflectances
df3["reflectances"].T.plot(kind="line", marker='o')
plt.ylabel("reflectance")
plt.xlabel("wavelengths [m]")
# put legend outside of plot
plt.gca().legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.grid()



In [12]:
# that's it, folks. If you want, you can save the created dataframe easily to csv:
df.to_csv("results.csv", index=False)

In [ ]:


In [ ]: